Clustering with Normalized Cuts is Clustering with a Hyperplane
Authors
Abstract
We present a set of clustering algorithms that identify cluster boundaries by searching for a hyperplanar gap in unlabeled data sets. The Normalized Cuts algorithm of Shi and Malik [1], originally presented as a graph-theoretic algorithm, can be interpreted as such an algorithm. Viewed in this light, Normalized Cuts pays more attention to points far from the center of the data set than to points near it. As a result, it can sometimes split long clusters and is sensitive to outliers. We derive a variant of Normalized Cuts that assigns uniform weight to all points, eliminating the sensitivity to outliers.
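To make the objective concrete, the sketch below evaluates the Normalized Cuts criterion of Shi and Malik, Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V), on 1-D points, where a hyperplane reduces to a single threshold. The Gaussian affinity, the `sigma` value, the helper names, and the toy data are all illustrative assumptions, and the exhaustive threshold sweep stands in for the spectral relaxation. It is not the algorithm derived in the paper.

```python
import math


def gaussian_affinity(points, sigma=1.0):
    """Pairwise Gaussian similarities between 1-D points (an assumed kernel)."""
    return [[math.exp(-((p - q) ** 2) / (2.0 * sigma ** 2)) for q in points]
            for p in points]


def ncut_value(W, in_a):
    """Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V).

    in_a[i] is True when point i belongs to side A. Self-similarities
    W[i][i] are counted in assoc here, which is a choice of this sketch.
    """
    n = len(W)
    cut = sum(W[i][j] for i in range(n) for j in range(n)
              if in_a[i] and not in_a[j])
    assoc_a = sum(W[i][j] for i in range(n) for j in range(n) if in_a[i])
    assoc_b = sum(W[i][j] for i in range(n) for j in range(n) if not in_a[i])
    return cut / assoc_a + cut / assoc_b


def best_threshold_split(points, sigma=1.0):
    """Try every threshold between consecutive distinct points; keep the lowest Ncut."""
    W = gaussian_affinity(points, sigma)
    ordered = sorted(set(points))
    best = None
    for lo, hi in zip(ordered, ordered[1:]):
        t = (lo + hi) / 2.0
        in_a = [p < t for p in points]  # both sides are non-empty by construction
        score = ncut_value(W, in_a)
        if best is None or score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy data: two tight groups separated by a wide gap.
points = [0.0, 0.2, 0.4, 5.0, 5.2, 5.4]
threshold, score = best_threshold_split(points)
print(threshold, score)
```

On this toy data the selected threshold falls inside the gap between the two groups, because a cut there severs only near-zero similarities while both sides keep a large association with the whole set.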
Similar resources
A Feature Space View of Spectral Clustering
The transductive SVM is a semi-supervised learning algorithm that searches for a large margin hyperplane in feature space. By withholding the training labels and adding a constraint that favors balanced clusters, it can be turned into a clustering algorithm. The Normalized Cuts clustering algorithm of Shi and Malik, although originally presented as spectral relaxation of a graph cut problem, ca...
Normalized cuts clustering with prior knowledge and a pre-clustering stage
Clustering is of interest in cases when data are not labeled enough and a prior training stage is unfeasible. In particular, spectral clustering based on graph partitioning is of interest to solve problems with highly non-linearly separable classes. However, spectral methods, such as the well-known normalized cuts, involve the computation of eigenvectors that is a highly time-consuming task in ...
Automatically finding clusters in normalized cuts
Normalized Cuts is a state-of-the-art spectral method for clustering. By applying spectral techniques, the data becomes easier to cluster and then k-means is classically used. Unfortunately the number of clusters must be manually set and it is very sensitive to initialization. Moreover, k-means tends to split large clusters, to merge small clusters, and to favor convex-shaped clusters. In this ...
A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach
In recent years, the advancement of information gathering technologies such as GPS and GSM networks have led to huge complex datasets such as time series and trajectories. As a result it is essential to use appropriate methods to analyze the produced large raw datasets. Extracting useful information from large data sets has always been one of the most important challenges in different sciences,...
Feature Selection Framework for White Matter Fiber Clustering Based on Normalized Cuts
Due to its ability to automatically identify spatially and functionally related white matter fiber bundles, fiber clustering has the potential to improve our understanding of white matter anatomy. The normalized cuts (NCut) criterion has proven to be a suitable method for clustering fiber tracts. In this work, we show that the NCut value can be used for unsupervised feature selection as a measu...
Publication date: 2004